Abstract

Much of the existing work on the broadcast channel focuses only on the sending of 
private messages. In this work we examine the scenario where the sender also wishes 
to transmit common messages to subsets of receivers. For an L-user broadcast channel
there are $2^L - 1$ subsets of receivers and correspondingly $2^L - 1$ independent messages.
The set of achievable rates for this channel is a $(2^L - 1)$-dimensional region. There are
fundamental constraints on the geometry of this region. For example, observe that if
the transmitter is able to simultaneously send L rate-one private messages, error-free
to all receivers, then by sending the same information in each message, it must be able
to send a single rate-one common message, error-free to all receivers. This swapping of
private and common messages illustrates that for any broadcast channel, the inclusion
of a point R* in the achievable rate region implies the achievability of a set of other
points that are not merely componentwise less than R*. We formally define this set
and characterize it for L = 2 and L = 3. Whereas for L = 2 all the points in the set 
arise only from operations relating to swapping private and common messages, for 
L = 3 a form of network coding is required. 



1 Introduction 

The broadcast channel has predominantly been studied in the context of unicast messaging, 
where the transmitter wishes to send one private message to each of the L receivers (see [1] 
for example). We refer to this as unicasting. The transmitter may however wish to send 
different messages to different subsets of receivers. We refer to this as multicasting. The 
most general multicast structure comprises $2^L - 1$ messages (the powerset). For L = 2
there are three messages, one required only by the first receiver, one required only by the 
second receiver, and one required by both receivers. 

The multicast capacity region for a broadcast channel is the set of $(2^L - 1)$-dimensional
rate vectors that are achievable. For L = 2 this is the set of achievable rate vectors
$(R_1, R_2, R_{12})$ where $R_{12}$ denotes the rate of the common message. One question of interest
is, can the multicast capacity region be inferred from the unicast capacity region? That 
is, can we always compute the multicast capacity region from the unicast capacity region, 
i.e. without knowing the structure of the channel? For certain broadcast channels this 
is true, although it is not true in general. Thus the multicast capacity region provides 
additional information about the communication limits of the channel beyond that of the 
unicast capacity region. 

Multicasting has received significant attention in the network coding literature. In [2]
and [3] the maximum rate at which a common message can be sent from a source node
through a network of directed noiseless links to a collection of sink nodes is shown to equal
the minimum cut of the associated graph. In [3] and [5] the multicast capacity region for
one-source-two-sink networks is fully characterized.^1 It is again given by the minimum cuts
of the associated graph. For three or more sinks this is not the case and the problem is 
open. In this exposition we shed light on it by characterizing some of the structure for 
three-sink networks. 

There is an oddity to multicasting. Suppose we have a two-user broadcast channel that 
can support a rate vector (1, 1, 1). That is the transmitter can simultaneously deliver one 
bit of private information to the first receiver, one bit of private information to the second 
receiver, and one bit of common information to both receivers. An important point to clarify
is that there is no secrecy requirement: "private" information sent to the first receiver may
or may not be decodable by the second receiver and vice versa. Then the channel can also
support a rate vector (2, 1, 0). The transmitter merely uses the common bit to send private
information to the first receiver. Of course the second receiver is capable of decoding this bit
too, but the information is of no interest to it. By symmetry the achievability of rate vector
(1, 1, 1) also implies the achievability of rate vector (1, 2, 0). There is one more implication
in this vein: the achievability of (1, 1, 1) implies the achievability of (0, 0, 2). The reasoning
is similar. The transmitter sends the same information on the two private bits. In this
way the first user receives the same private bit as the second user, in addition to the
same common bit. Thus two common bits have been sent. These three manipulations are
summarized in figure 1 as extremal rays stemming from (1, 1, 1) and represent three distinct
encoding/decoding operations that can always be performed, regardless of the structure of
the broadcast channel. In this sense they are universal. By time-sharing one can achieve
any point in the polytope indicated in figure 1. To summarize: if a rate vector (1, 1, 1)
is achievable, so must be the region illustrated, regardless of the channel. Is this set of
operations complete? Put in reverse, are there any rate vectors outside the polytope in
figure 1 that are achievable for all broadcast channels for which (1, 1, 1) is achievable?
The answer is that there are not: there exists a broadcast channel where the rate vector
(1, 1, 1) is achievable, but no rate vector outside the polytope in figure 1 is. Thus for
two-user broadcast channels the three operations discussed form a complete set: they are
the only distinct universal encoding/decoding operations.
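
The three two-user manipulations can be written as difference vectors added to (1, 1, 1). The following is a minimal sketch, assuming the ordering $(R_1, R_2, R_{12})$ used in the text; the names `OPERATIONS`, `apply_operation` and `time_share` are illustrative and do not come from the paper.

```python
# Rate vectors are ordered (R1, R2, R12), as in the text.
BASE = (1, 1, 1)

# Difference vectors of the three universal operations described above.
OPERATIONS = {
    "common bit reused as a private bit for receiver 1": (1, 0, -1),
    "common bit reused as a private bit for receiver 2": (0, 1, -1),
    "both private bits carry the same common bit": (-1, -1, 1),
}


def apply_operation(rate, delta):
    """Add a difference vector to a rate vector, component by component."""
    return tuple(r + d for r, d in zip(rate, delta))


def time_share(point_a, point_b, fraction):
    """Rate vector obtained by using point_a for `fraction` of the time."""
    return tuple(fraction * a + (1 - fraction) * b
                 for a, b in zip(point_a, point_b))


corner_points = {name: apply_operation(BASE, delta)
                 for name, delta in OPERATIONS.items()}
for name, point in corner_points.items():
    print(point, "<-", name)
```

Time-sharing between any of these points and (1, 1, 1) stays inside the polytope; for instance `time_share((2, 1, 0), (0, 0, 2), 0.5)` gives the midpoint of two corner points.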

^1 To be more precise, we define the multicast capacity region of a network as the convex hull of the
union of all multicast capacity regions of broadcast channels that arise from specifying the encoding and
decoding operations at intermediate nodes in the network.

It is straightforward to generalize these operations to broadcast channels with an arbitrary
number of users. Consider for example the three-user broadcast channel. There are
seven messages. Suppose a rate vector $(R_1, R_2, R_3, R_{12}, R_{13}, R_{23}, R_{123}) = (1, 1, 1, 1, 1, 1, 1)$
is achievable (for example, $R_{13}$ represents the rate of the message intended for receivers
1 and 3). Then for any two subsets of receivers $\mathcal{I} \subset \mathcal{J}$ we can perform the operation
$R_{\mathcal{I}} \to R_{\mathcal{I}} + 1$, $R_{\mathcal{J}} \to R_{\mathcal{J}} - 1$, and for any two subsets of receivers $\mathcal{I} \neq \mathcal{J}$ we can perform
the operation $R_{\mathcal{I}} \to R_{\mathcal{I}} - 1$, $R_{\mathcal{J}} \to R_{\mathcal{J}} - 1$, $R_{\mathcal{I} \cup \mathcal{J}} \to R_{\mathcal{I} \cup \mathcal{J}} + 1$. For instance we may swap
the first and second receivers' private bits for a common bit that is sent to the pair, so that
the rate vector (0, 0, 1, 2, 1, 1, 1) is achieved. Similarly the rate vector (0, 1, 1, 1, 1, 0, 2) can
be achieved by using the first receiver's private bit and the bit common to the second and
third receivers to send information common to all three receivers. It can be shown that the
number of distinct operations of this form is 15. That is, if the rate vector (1, 1, 1, 1, 1, 1, 1)
is achievable, so is the set of points contained within a 15-edged polytope, which is the
generalization to L = 3 of the polytope in figure 1.
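
The count of 15 can be reproduced with a short enumeration. The sketch below rests on two assumptions that are not stated in the paper: the second kind of operation is restricted to pairs of subsets where neither contains the other (otherwise it only lowers a rate), and an operation is discarded when it equals the sum of two other candidates. This pairwise filter is simpler than a full extreme-ray computation, so it is a plausibility check, not a proof. All names are illustrative.

```python
# Message subsets in the order (1, 2, 3, 12, 13, 23, 123) used in the text.
SUBSETS = [frozenset(s) for s in
           ({1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3})]
INDEX = {subset: k for k, subset in enumerate(SUBSETS)}


def delta(changes):
    """Build a 7-dimensional difference vector from {subset: change}."""
    vec = [0] * len(SUBSETS)
    for subset, change in changes.items():
        vec[INDEX[frozenset(subset)]] += change
    return tuple(vec)


candidates = set()
for a in SUBSETS:
    for b in SUBSETS:
        if a < b:
            # Relabel a bit meant for the larger set b as a bit for a.
            candidates.add(delta({a: 1, b: -1}))
        elif a != b and not b < a:
            # Send the same bit on links a and b: it reaches a | b.
            candidates.add(delta({a: -1, b: -1, a | b: 1}))


def is_sum_of_two(vec, pool):
    """True if vec is the composition (sum) of two operations in pool."""
    return any(tuple(x + y for x, y in zip(u, v)) == vec
               for u in pool for v in pool)


distinct = sorted(v for v in candidates if not is_sum_of_two(v, candidates))
print(len(candidates), "candidates,", len(distinct), "left after filtering")
```

Under these assumptions the surviving operations are the nine relabelings between a set and a set one element larger, plus the six merges of disjoint sets, which includes the two examples given in the text.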

Again we ask the question, is this set of operations complete? Are there any points 
outside this 15-edged polytope that are universally achievable on any three-user broadcast 
channel? The answer, perhaps surprisingly, is yes. There exists a sixteenth distinct
universal encoding/decoding operation. It does not involve a mere relabeling of common and
private bits. It enables the rate vector (1, 1, 1, 0, 0, 0, 3) to be achieved. This new operation
together with the fifteen trivial ones forms the complete set of distinct universal
encoding/decoding operations for L = 3. That is, all other rate vectors universally achievable
from (1, 1, 1, 1, 1, 1, 1) can be achieved by time sharing between these 16 distinct universal 
encoding/decoding operations. 

Now we turn to the multiple access channel (MAC) with L users. The MAC has also 
typically been studied in the context of unicast messaging where its capacity region has
in many cases been completely characterized. For multicasting the capacity of the discrete 
memoryless MAC is computed in [6] and a conjecture regarding the generalization of this 
result to an arbitrary number of users is given. 

Let us apply the reasoning we applied above for the broadcast channel to the MAC.
Consider a two-user MAC. Each transmitter wishes to send a private message of rate $R_i$
to the receiver for $i \in \{1, 2\}$. In addition there is a common message of rate $R_{12}$ that
both transmitters share, and desire to be sent to the receiver. Suppose for a given MAC
a rate vector $(R_1, R_2, R_{12}) = (1, 1, 1)$ is achievable. Then the first transmitter could just
label its rate-one bit stream as common information and send it to the receiver. Thus
the rate vector (0, 1, 2) is also achievable. By symmetry the second transmitter could
do the same so (1, 0, 2) is achievable too.^2 Are there any other operations that trade off
between elements of the rate vector? The answer is no. For the broadcast channel we
could swap common information for private, but not so for the MAC. More specifically we
cannot relabel common information as private, as a common bitstream may require both
transmitters to have access to it in order for it to be passed to the receiver. A private bitstream
assumes only a single transmitter has access to it. The (1, 1, 1)-multicast region for the
two-user MAC is plotted in figure 2. There are three extremal rays and correspondingly
three distinct universal encoding/decoding operations. The first two are stated above and
the third consists of merely lowering the common rate so as to arrive at the point (1, 1, 0).

^2 We could combine these two arguments to conclude (0, 0, 3) is achievable but we will not be interested
in this operation as it can be expressed as a linear combination of others.

Figure 1: The (1, 1, 1)-multicast region for the broadcast channel, L = 2.

Unlike the broadcast channel, this structure directly generalizes to L users. For three 
users there are ten universal encoding/decoding operations. Six result from relabeling 
private information as pairwise. Three result from relabeling pairwise as common and the 
last results from lowering the common rate. Thus the multicast capacity region of the 
multiple access channel has a less intricate structure than that of the broadcast. 
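
The MAC operations listed above follow one pattern: move rate from a message to a message shared by exactly one more transmitter, or lower the rate of the message shared by everyone. The sketch below enumerates them under that reading. Extending the pattern to general L is an assumption made here for illustration (the text only gives the counts for L = 2 and L = 3), and the function names are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(num_users):
    """Non-empty subsets of {1..num_users}, ordered by size then lexicographically."""
    users = range(1, num_users + 1)
    return [frozenset(c) for size in range(1, num_users + 1)
            for c in combinations(users, size)]


def mac_operations(num_users):
    """Difference vectors of the relabeling operations for an L-user MAC."""
    subsets = nonempty_subsets(num_users)
    index = {s: k for k, s in enumerate(subsets)}
    ops = []
    for small in subsets:
        for large in subsets:
            # Relabel information known to `small` as known to `large`,
            # where `large` has exactly one additional transmitter.
            if small < large and len(large) == len(small) + 1:
                vec = [0] * len(subsets)
                vec[index[small]] -= 1
                vec[index[large]] += 1
                ops.append(tuple(vec))
    # Final operation: lower the rate of the fully common message.
    lower_common = [0] * len(subsets)
    lower_common[-1] = -1
    ops.append(tuple(lower_common))
    return subsets, ops


for num_users in (2, 3):
    _, ops = mac_operations(num_users)
    print("L =", num_users, "->", len(ops), "operations")
```

For L = 2 the vectors returned are (-1, 0, 1), (0, -1, 1) and (0, 0, -1), which map (1, 1, 1) to the three points (0, 1, 2), (1, 0, 2) and (1, 1, 0) named in the text.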

In this paper we characterize the complete set of distinct universal encoding/decoding 
operations and the associated region of achievable rate vectors, for both the broadcast 
channel and the MAC channel, for L = 3. In essence this is a characterization of the 
universal constraints on the multicast capacity region of these channels. 

Section 2 describes the notation we use. In section 3 we describe the problem in
detail. Section 4 presents the results and sections 5 and 6 the proofs.

2 Preliminary Notation 

We briefly describe some of the notation that will be used. Typically $\mathcal{I}$ and $\mathcal{J}$ will be
used to denote subsets of $\{1, 2, 3\}$. For example we may have $\mathcal{I} = \{2, 3\}$, which would
imply $R_{\mathcal{I}} = R_{\{2,3\}} = R_{23}$. Rates in bold font represent tuples, for example we may have
$\mathbf{R} = (R_1, R_2, R_{12})$. Elements of time series are indicated by an index in parentheses following
the variable, for example $Y(i)$. An entire time series is represented by bold font, for example
$\mathbf{W}_i = [W_i(1), \dots, W_i(n)]$. If $S$ is a set then $2^S$ denotes the powerset (the set of all subsets
of $S$) excluding the nullset, e.g. if $S = \{1, 2\}$ then $2^S = \{\{1\}, \{2\}, \{1, 2\}\}$. We denote the
nullset by $\emptyset$. The symbol $\preceq$ denotes element-wise inequality.

Figure 2: The (1, 1, 1)-multicast region for the multiple access channel, L = 2.

3 Problem Setup 

Consider a broadcast channel with three receivers. The input alphabet is denoted $\mathcal{X}$ and
the output alphabets $\mathcal{Y}_1, \mathcal{Y}_2, \mathcal{Y}_3$. The probability transition function is $p(y_1, y_2, y_3 | x)$. The
message vector is

$$(W_1, W_2, W_3, W_{12}, W_{13}, W_{23}, W_{123}).$$

The subscript denotes the subset of receivers for which the message is intended, for
example message $W_{23}$ is intended for receivers 2 and 3. Denote the rate vector $\mathbf{R} =
(R_1, R_2, R_3, R_{12}, R_{13}, R_{23}, R_{123})$. A $(2^{n\mathbf{R}}, n)$ code consists of an encoder

$$x^n : \prod_{\mathcal{I} \subseteq \{1,2,3\}} \{1, \dots, 2^{nR_{\mathcal{I}}}\} \to \mathcal{X}^n$$

and twelve decoders

$$\hat{W}_{i,\mathcal{I}} : \mathcal{Y}_i^n \to \{1, \dots, 2^{nR_{\mathcal{I}}}\}$$

where $i \in \{1, 2, 3\}$ denotes the receiver and $\mathcal{I} \subseteq \{1, 2, 3\}$ with $i \in \mathcal{I}$ denotes the message
index. Thus each receiver decodes four messages (the first receiver decodes $W_1, W_{12}, W_{13}, W_{123}$,
etc.). The probability of error $P_e^{(n)}$ is defined to be the probability that at least one of
the twelve decoded messages differs from the message that was sent,

$$P_e^{(n)} = \Pr\Big( \bigcup_{i \in \{1,2,3\}} \bigcup_{\mathcal{I} \ni i} \big\{ \hat{W}_{i,\mathcal{I}} \neq W_{\mathcal{I}} \big\} \Big),$$

where the seven messages are assumed to be mutually independent and uniformly
distributed over $\prod_{\mathcal{I} \subseteq \{1,2,3\}} \{1, \dots, 2^{nR_{\mathcal{I}}}\}$.

Figure 3: System diagrams for L = 2.

Definition 3.1. A multicast rate vector is said to be achievable for the broadcast channel
if there exists a sequence of $(2^{n\mathbf{R}}, n)$ codes with $P_e^{(n)} \to 0$.

Definition 3.2. The multicast capacity region of the broadcast channel is the closure of
the set of achievable multicast rate vectors. It is denoted $\mathcal{C}_{p(y_1, y_2, y_3 | x)}$ or $\mathcal{C}$ for short.

Often we will omit the adjective 'multicast'. 

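We now give a definition that makes precise the operation of swapping common and
private messages, and quantifies the change in the rate vector. Let $\mathbf{R}^W$ and $\mathbf{R}^M$ be two
rate vectors.

Definition 3.3. A $(d\mathbf{R}, n)$-universal encoding/decoding operation is a pair of mappings

$$W_{\mathcal{J}} : \prod_{\mathcal{I} \subseteq \{1,2,3\}} \{1, \dots, 2^{nR^M_{\mathcal{I}}}\} \to \{1, \dots, 2^{nR^W_{\mathcal{J}}}\}$$

and

$$\hat{M}_{i,\mathcal{I}} : \prod_{\mathcal{J} \subseteq \{1,2,3\} \text{ s.t. } i \in \mathcal{J}} \{1, \dots, 2^{nR^W_{\mathcal{J}}}\} \to \{1, \dots, 2^{nR^M_{\mathcal{I}}}\}$$

for all $\mathcal{J} \subseteq \{1, 2, 3\}$ and all $i \in \{1, 2, 3\}$, for all $\mathcal{I} \subseteq \{1, 2, 3\}$ such that $i \in \mathcal{I}$, with the
properties $\hat{M}_{i,\mathcal{I}} = \hat{M}_{j,\mathcal{I}}$ for all $i, j \in \{1, 2, 3\}$; $\mathbf{R}^M \npreceq \mathbf{R}^W$; and

$$\frac{\mathbf{R}^M - \mathbf{R}^W}{|\mathbf{R}^M - \mathbf{R}^W|} = d\mathbf{R},$$

$W(M)$ being the universal encoder output and $\hat{M}(\hat{W})$ being the universal decoder output.
The vector $d\mathbf{R}$ is referred to as the 'normalized difference vector'.

The property $\hat{M}_{i,\mathcal{I}} = \hat{M}_{j,\mathcal{I}}$ for all $i, j \in \{1, 2, 3\}$ ensures that all users agree on the common
messages they decode. See figure 3 for a system diagram that illustrates the relationship
between $M$, $W$, $\hat{W}$ and $\hat{M}$.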

Example 3.4. Suppose $\mathbf{R}^W = (1, 0, 0, 1, 0, 0, 0)$ and $\mathbf{R}^M = (2, 0, 0, 0, 0, 0, 0)$. Let n = 1.
Then the mapping $W_1(M) = M_1(1)$, $W_{12}(M) = M_1(2)$ is a universal encoding operation
with $d\mathbf{R} = (1, 0, 0, -1, 0, 0, 0)/\sqrt{2}$. The universal decoding operation is the inverse mapping
given by $\hat{M}_1(\hat{W}) = [\hat{W}_1, \hat{W}_{12}]$.
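
Example 3.4 can be checked numerically. The sketch below computes the normalized difference vector from the two rate vectors and implements the one-shot (n = 1) encoder and decoder; the function names and the dictionary representation of the streams are illustrative choices, not notation from the paper.

```python
import math


def normalized_difference(rate_from, rate_to):
    """Unit-length vector pointing from rate_from to rate_to."""
    diff = [b - a for a, b in zip(rate_from, rate_to)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("rate vectors are identical")
    return tuple(d / norm for d in diff)


# Rate vectors ordered (R1, R2, R3, R12, R13, R23, R123).
R_W = (1, 0, 0, 1, 0, 0, 0)   # rates supported by the underlying code
R_M = (2, 0, 0, 0, 0, 0, 0)   # rates seen by the user of the operation
dR = normalized_difference(R_W, R_M)


def encode(m1):
    """Map the two bits of M_1 onto the W_1 and W_12 streams (n = 1)."""
    return {"W1": m1[0], "W12": m1[1]}


def decode(w_hat):
    """Receiver 1 reassembles M_1 from its decoded W_1 and W_12."""
    return (w_hat["W1"], w_hat["W12"])


print(dR)
```

Receiver 2 also decodes the `W12` stream, but in this operation it simply discards it.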

Definition 3.5. A $d\mathbf{R}$-universal encoding/decoding operation is called 'distinct' if the
vector $d\mathbf{R}$ cannot be expressed as a positive linear combination of vectors $\{d\mathbf{R}_i\} \neq d\mathbf{R}$ for
which there exist $d\mathbf{R}_i$-universal encoding/decoding operations for $i = 1, 2, \dots$. The (rays
associated with the) normalized difference vectors corresponding to distinct $d\mathbf{R}$-universal
encoding/decoding operations are called 'extremal rays'.

By positive linear combination we mean a weighted linear sum with non-negative coeffi- 
cients. 

Example 3.6. It will be evident later that the universal encoding/decoding operation given
in example 3.4 is distinct and thus $(1, 0, 0, -1, 0, 0, 0)/\sqrt{2}$ is an extremal ray. By symmetry
$(0, 1, 0, -1, 0, 0, 0)/\sqrt{2}$ is also an extremal ray. Note distinctness does not imply
uniqueness: the universal encoding/decoding operation that moves from rate vector $\mathbf{R}^W =
(1, 0, 0, 1, 0, 0, 0)$ to rate vector $\mathbf{R}^M = (1.5, 0, 0, 0.5, 0, 0, 0)$ is also classified as distinct, but
it has the same normalized difference vector. An example of a universal encoding/decoding
operation that is not distinct is one that moves from rate vector $\mathbf{R}^W = (1, 0, 0, 1, 0, 0, 0)$
to rate vector $\mathbf{R}^M = (1.5, 0.5, 0, 0, 0, 0, 0)$. The corresponding normalized difference
vector is $d\mathbf{R}_A = (0.5, 0.5, 0, -1, 0, 0, 0)/\sqrt{1.5}$. The universal encoding/decoding operations
that achieve this shift correspond to time-sharing between two operations, one with normalized
difference vector $d\mathbf{R}_B = (1, 0, 0, -1, 0, 0, 0)/\sqrt{2}$, the other with normalized difference
vector $d\mathbf{R}_C = (0, 1, 0, -1, 0, 0, 0)/\sqrt{2}$. Indeed we have

$$d\mathbf{R}_A = \frac{1}{\sqrt{3}}\, d\mathbf{R}_B + \frac{1}{\sqrt{3}}\, d\mathbf{R}_C.$$
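
The weights in this decomposition can be verified by hand. The short derivation below uses only the three vectors stated in the example; the subscripts A, B, C label them in the order they appear.

```latex
\begin{align*}
(0.5,\, 0.5,\, 0,\, -1,\, 0,\, 0,\, 0)
  &= \tfrac{1}{2}\,(1, 0, 0, -1, 0, 0, 0) + \tfrac{1}{2}\,(0, 1, 0, -1, 0, 0, 0) \\
\sqrt{1.5}\; d\mathbf{R}_A
  &= \tfrac{\sqrt{2}}{2}\; d\mathbf{R}_B + \tfrac{\sqrt{2}}{2}\; d\mathbf{R}_C \\
d\mathbf{R}_A
  &= \frac{\sqrt{2}}{2\sqrt{1.5}}\,\bigl(d\mathbf{R}_B + d\mathbf{R}_C\bigr)
   = \frac{1}{\sqrt{3}}\; d\mathbf{R}_B + \frac{1}{\sqrt{3}}\; d\mathbf{R}_C
\end{align*}
```

Both weights are non-negative, so the operation is a positive linear combination of two others and is therefore not distinct in the sense of Definition 3.5.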
Figure 4: Results for L = 2.

We now give a formal definition of the region alluded to in figure 1. Let

$$\mathbf{R}^* = (R_1^*, R_2^*, R_3^*, R_{12}^*, R_{13}^*, R_{23}^*, R_{123}^*)$$

be a parameter.

Definition 3.7. The '$\mathbf{R}^*$-multicast region' is the intersection of the capacity regions of all
broadcast channels for which the rate vector $\mathbf{R}^*$ is achievable, i.e.

$$\bigcap_{p(y_1, y_2, y_3 | x) \,:\, \mathbf{R}^* \in \mathcal{C}_{p(y_1, y_2, y_3 | x)}} \mathcal{C}_{p(y_1, y_2, y_3 | x)}.$$

See figure 1 for an example of this region.

As the problem setup for the multiple access channel is entirely analogous to the
aforementioned setup for the broadcast channel, we do not explicitly describe it. An example
of the $\mathbf{R}^*$-multicast region is given in figure 2.

The aim of this paper is to characterize the R*-multicast cones for both the broadcast 
and multiple access channels. 



4 Results 

Theorem 4.1. For L = 3 the $\mathbf{R}^*$-multicast region of the broadcast channel is the set of all
$\mathbf{R} \in \mathbb{R}^7_+$ satisfying

$$\mathbf{G}_{BC,3}^T (\mathbf{R} - \mathbf{R}^*) \preceq \mathbf{0} \qquad (1)$$

where $\mathbf{G}_{BC,3}$ is given in figure 5. This region is a polytope, characterized by the cone
$\{\mathbf{R} \in \mathbb{R}^7 : \mathbf{G}_{BC,3}^T \mathbf{R} \preceq \mathbf{0}\}$. We refer to this cone as the L = 3 'multicast cone'. The sixteen
extremal rays of this cone are given by the columns of the matrix $\mathbf{H}_{BC,3}$ in figure 5. Thus
there are 16 distinct universal encoding/decoding operations for L = 3.

The (1, 1, 1)-multicast region for the broadcast channel for L = 2 is illustrated in figure
1. For L = 2 there are 3 distinct universal encoding/decoding operations. The $\mathbf{G}_{BC,2}$ and
$\mathbf{H}_{BC,2}$ matrices are given in figure 4.

The columns of $\mathbf{G}_{BC,2}$ are the normal vectors to the three hyperplanes bounding the region.
The columns of $\mathbf{H}_{BC,2}$ are the three extremal rays (see figure 1).
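
For L = 2 the two descriptions can be exercised in a few lines. The matrices in the figure are not reproduced in this text, so the sketch below uses stand-ins that are assumptions: `H_COLUMNS` holds the unnormalized difference vectors of the three operations from the introduction, and `G_COLUMNS` holds normals derived here by requiring a non-positive inner product with every column of `H_COLUMNS`. The figure may scale or order them differently.

```python
# Rate vectors are ordered (R1, R2, R12).
H_COLUMNS = [(1, 0, -1), (0, 1, -1), (-1, -1, 1)]   # assumed extremal rays
G_COLUMNS = [(1, 1, 1), (1, 0, 1), (0, 1, 1)]       # assumed bounding normals


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def in_multicast_region(rate, base, tol=1e-9):
    """Test rate >= 0 and G^T (rate - base) <= 0, component by component."""
    if any(r < -tol for r in rate):
        return False
    diff = [r - b for r, b in zip(rate, base)]
    return all(dot(g, diff) <= tol for g in G_COLUMNS)


base = (1, 1, 1)
for point in [(2, 1, 0), (1, 2, 0), (0, 0, 2), (2, 2, 0), (0, 0, 3)]:
    print(point, in_multicast_region(point, base))
```

With these stand-ins the three inequalities read $R_1 + R_2 + R_{12} \le 3$, $R_1 + R_{12} \le 2$ and $R_2 + R_{12} \le 2$ for the base point (1, 1, 1).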
Figure 5: Results for L = 3.



Theorem 4.2. For L = 3 the $\mathbf{R}^*$-multicast region of the multiple access channel is the set
of all $\mathbf{R} \in \mathbb{R}^7_+$ satisfying

$$\mathbf{G}_{MAC,3}^T (\mathbf{R} - \mathbf{R}^*) \preceq \mathbf{0} \qquad (2)$$

where $\mathbf{G}_{MAC,3}$ is given in figure 5. This region is also a polytope characterized by the cone
$\{\mathbf{R} \in \mathbb{R}^7 : \mathbf{G}_{MAC,3}^T \mathbf{R} \preceq \mathbf{0}\}$. The 10 extremal rays of this cone are given by the columns
of the matrix $\mathbf{H}_{MAC,3}$ in figure 5. Thus there are 10 distinct universal encoding/decoding
operations for L = 3.

The (1, 1, 1)-multicast region for the MAC for L = 2 is illustrated in figure 2. There
are 3 distinct universal encoding/decoding operations. The $\mathbf{G}_{MAC,2}$ and $\mathbf{H}_{MAC,2}$ matrices
are given in figure 4.

An alternative interpretation of theorem 4.1 is the following (the same interpretation
applies for theorem 4.2). For notational simplicity we denote the capacity region of an arbitrary
broadcast channel by $\mathcal{C}$. Let

$$\mathbf{R}^*(\alpha) = \arg\max_{\mathbf{R} \in \mathcal{C}} \alpha^T \mathbf{R}.$$

$\mathbf{R}^*(\alpha)$ is the rate vector lying on the boundary of the capacity region in the direction of $\alpha$.
Let

$$\mathcal{C}^*(\alpha) = \{\mathbf{R} \in \mathbb{R}^7_+ \mid \alpha^T \mathbf{R} \le \alpha^T \mathbf{R}^*(\alpha)\}.$$

$\mathcal{C}^*(\alpha)$ is the halfspace of all rate vectors lying underneath the hyperplane $\alpha^T \mathbf{R} = \alpha^T \mathbf{R}^*(\alpha)$.
The region $\mathcal{C}$ is convex and thus we can characterize it by its support function, i.e.

$$\mathcal{C} = \bigcap_{\alpha \in \mathbb{R}^7_+} \mathcal{C}^*(\alpha).$$

However this is not the minimal dual representation of $\mathcal{C}$. Let

$$\mathcal{H}^* \triangleq \{\alpha \in \mathbb{R}^7_+ \mid \alpha^T \mathbf{H}_{BC,3} \preceq \mathbf{0}\}.$$

Corollary 4.3. The multicast capacity region of any broadcast channel with three
receivers can be expressed as

$$\mathcal{C} = \bigcap_{\alpha \in \mathcal{A}} \mathcal{C}^*(\alpha)$$

if and only if

$$\mathcal{A} \supseteq \mathcal{H}^*.$$

This says the following: when computing the multicast capacity region of a broadcast
channel by maximizing the weighted sum-rate, the smallest set over which one need vary the
weighting coefficients $\alpha$ is $\mathcal{H}^*$. Put another way, the normal vector $\alpha$ to any point on
the boundary of the multicast capacity region is always contained in the set $\mathcal{H}^*$. See figure
6.




Figure 6: The normal vector $\alpha$ of the broadcast channel capacity region satisfies $\alpha \in \mathcal{H}^*$.
(a) A capacity region that cannot occur. (b) A capacity region that can occur.



5 Proof of Theorem 4.1

The direct part of the proof consists of showing that for any broadcast channel, if a rate
vector $\mathbf{R}^*$ is achievable then all rate vectors in the region given by equation (1) are
achievable. This establishes that the $\mathbf{R}^*$-multicast region is 'at least as large' as the region given
by equation (1). The converse part of the proof consists of exhibiting, for each $\mathbf{R}^* \in \mathbb{R}^7_+$,
a broadcast channel for which no rate vector outside the region given by equation (1) is
achievable. This establishes that the $\mathbf{R}^*$-multicast region is 'at least as small' as the region
given by equation (1). We start with the direct part. For notational simplicity we drop the
broadcast channel (BC) subscript.

5.1 Direct Part 

Suppose that $\mathbf{R}^*$ is achievable for a particular broadcast channel. We show that any rate
vector $\mathbf{R} \in \mathbb{R}^7_+$ satisfying

$$\mathbf{R} \preceq \mathbf{R}^* + \mathbf{H}_{BC,3} \boldsymbol{\lambda} \qquad (3)$$

for $\boldsymbol{\lambda} \in \mathbb{R}^{16}_+$ is also achievable. We then show that this region is precisely the one given
in equation (1). Let $\lambda_i$ denote the ith element of $\boldsymbol{\lambda}$ and $\mathbf{H}_{BC,3}(i)$ denote the ith column
of $\mathbf{H}_{BC,3}$. To show that any rate vector satisfying equation (3) is achievable, we show that
each of the 16 rate vectors given by

$$\mathbf{R}^{(i)} = \mathbf{R}^* + \mathbf{H}_{BC,3}(i) \lambda_i^*, \quad i = 1, \dots, 16 \qquad (4)$$

is achievable, where

$$\lambda_i^* = \max_{-\mathbf{H}_{BC,3}(i) \lambda \,\preceq\, \mathbf{R}^*} \lambda.$$

By time sharing between these vectors the entire boundary region $\{\mathbf{R}^* + \mathbf{H}_{BC,3} \boldsymbol{\lambda} \mid \boldsymbol{\lambda} \in \mathbb{R}^{16}_+\}$
is achieved and hence any point within it (i.e. satisfying equation (3)) can also be achieved.

Let $M$, $\hat{M}$ correspond to the binary message vector and estimate of the message vector,
respectively, that the transmitter wishes to send at rate vector $\mathbf{R}^{(i)}$. We illustrate the
achievability of equation (4) for i = 3.

To universally encode for i = 3, assume without loss of generality that $R_1^* \le R_2^*$. In
what follows we ignore rounding effects as it will be clear that in the limit $n \to \infty$ they are
negligible. Set

$$W_1^n = [M_{12}(1), \dots, M_{12}(nR_1^*)]$$
$$W_2^n = [M_{12}(1), \dots, M_{12}(nR_1^*), M_2(1), \dots, M_2(nR_2^* - nR_1^*)]$$
$$W_{12}^n = [M_{12}(nR_1^* + 1), \dots, M_{12}(nR_1^* + nR_{12}^*)]$$

In words, the information common to receivers 1 and 2 is split into two parts. The first part
is replicated and sent separately down both receiver 1 and receiver 2's private channels.
The second part is sent down the channel common to both receivers. As receiver 2's private
channel can accommodate a higher bit-rate than receiver 1's, there is some bandwidth left
over. This is allocated to sending some of receiver 2's private information.



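For all other subsets $\mathcal{J}$ of $\{1, 2, 3\}$ set $W_{\mathcal{J}}^n = M_{\mathcal{J}}^n$ and $\hat{M}_{\mathcal{J}}^n = \hat{W}_{\mathcal{J}}^n$. Universal decoding is
straightforward. The first receiver sets

$$\hat{M}_{12}^n = [\hat{W}_1^n, \hat{W}_{12}^n]$$
$$\hat{M}_{13}^n = \hat{W}_{13}^n$$
$$\hat{M}_{123}^n = \hat{W}_{123}^n$$

and in this way successfully recovers its message, as the achievability of $\mathbf{R}^*$ implies that $W$
was decoded correctly. The second receiver sets

$$\hat{M}_2^n = [\hat{W}_2(nR_1^* + 1), \dots, \hat{W}_2(nR_2^*)]$$
$$\hat{M}_{12}^n = [\hat{W}_2(1), \dots, \hat{W}_2(nR_1^*), \hat{W}_{12}^n]$$
$$\hat{M}_{23}^n = \hat{W}_{23}^n$$
$$\hat{M}_{123}^n = \hat{W}_{123}^n$$

and is similarly successful in decoding. The third receiver sets $\hat{M}_{\mathcal{J}}^n = \hat{W}_{\mathcal{J}}^n$ for all of its
messages. Then we have achieved a rate vector of

$$\mathbf{R}^{(3)} = (0,\; R_2^* - R_1^*,\; R_3^*,\; R_{12}^* + R_1^*,\; R_{13}^*,\; R_{23}^*,\; R_{123}^*) = \mathbf{R}^* + \mathbf{H}_{BC,3}(3) \lambda_3$$

with $\lambda_3 = R_1^*$. The universal encoding and decoding procedures for all other $i \in \{1, \dots, 15\}$
are similar and follow from the structure of the columns of the matrix $\mathbf{H}_{BC,3}$.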
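
The encoder and decoders for i = 3 amount to slicing and concatenating bit lists. The following sketch mirrors the construction above for finite bit counts; the argument names `n_r1`, `n_r2`, `n_r12` stand for $nR_1^*$, $nR_2^*$, $nR_{12}^*$ and, like the function names, are illustrative. It assumes the underlying code delivers the W streams without error.

```python
def encode_merge_private(m12, m2, n_r1, n_r2, n_r12):
    """Build the W1, W2 and W12 streams from the messages M12 and M2.

    The first n_r1 bits of M12 are replicated on both private streams;
    the rest of M12 uses the common stream, and the spare capacity of
    receiver 2's private stream carries M2.
    """
    assert n_r1 <= n_r2
    assert len(m12) == n_r1 + n_r12 and len(m2) == n_r2 - n_r1
    w1 = m12[:n_r1]
    w2 = m12[:n_r1] + m2
    w12 = m12[n_r1:]
    return w1, w2, w12


def decode_receiver_1(w1, w12):
    """Receiver 1 rebuilds M12 from its private and common streams."""
    return w1 + w12


def decode_receiver_2(w2, w12, n_r1):
    """Receiver 2 rebuilds (M2, M12) from its private and common streams."""
    m2 = w2[n_r1:]
    m12 = w2[:n_r1] + w12
    return m2, m12


m12 = [1, 0, 1, 1, 0]
m2 = [0, 1]
w1, w2, w12 = encode_merge_private(m12, m2, n_r1=2, n_r2=4, n_r12=3)
print(decode_receiver_1(w1, w12), decode_receiver_2(w2, w12, n_r1=2))
```

In this toy instance M12 carries $n(R_1^* + R_{12}^*) = 5$ bits and M2 carries $n(R_2^* - R_1^*) = 2$ bits, matching the rate vector achieved by the operation.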

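Universal encoding and decoding for i = 16 is different. Assume without loss of
generality that $R_{12}^* \le R_{13}^* \le R_{23}^*$. To encode, set $W_i^n = M_i^n$ for i = 1, 2, 3 and

$$W_{12}^n = [M_{123}(1), \dots, M_{123}(nR_{12}^*)]$$
$$W_{13}^n = [M_{123}(nR_{12}^* + 1), \dots, M_{123}(2nR_{12}^*), M_{13}(1), \dots, M_{13}(nR_{13}^* - nR_{12}^*)]$$
$$W_{23}^n = [M_{123}(1) \oplus M_{123}(nR_{12}^* + 1), \dots, M_{123}(nR_{12}^*) \oplus M_{123}(2nR_{12}^*),
M_{23}(1), \dots, M_{23}(nR_{23}^* - nR_{12}^*)]$$
$$W_{123}^n = [M_{123}(2nR_{12}^* + 1), \dots, M_{123}(2nR_{12}^* + nR_{123}^*)]$$

In words, the information common to all receivers is split into three streams. The first and
second are sent at rate $R_{12}^*$ using the three pairwise links. The third stream is sent at rate
$R_{123}^*$ across the link common to all receivers.

In the decoders below, $\hat{W}_{13}^n$ and $\hat{W}_{23}^n$ inside the expressions for $\hat{M}_{123}^n$ refer to the first
$nR_{12}^*$ symbols of those streams. The first receiver decodes by setting $\hat{M}_1^n = \hat{W}_1^n$ and

$$\hat{M}_{13}^n = [\hat{W}_{13}(nR_{12}^* + 1), \dots, \hat{W}_{13}(nR_{13}^*)]$$
$$\hat{M}_{123}^n = [\hat{W}_{12}^n, \hat{W}_{13}^n, \hat{W}_{123}^n].$$

The second receiver decodes by setting $\hat{M}_2^n = \hat{W}_2^n$ and

$$\hat{M}_{23}^n = [\hat{W}_{23}(nR_{12}^* + 1), \dots, \hat{W}_{23}(nR_{23}^*)]$$
$$\hat{M}_{123}^n = [\hat{W}_{12}^n, \hat{W}_{12}^n \oplus \hat{W}_{23}^n, \hat{W}_{123}^n].$$

The third receiver decodes by setting $\hat{M}_3^n = \hat{W}_3^n$ and

$$\hat{M}_{13}^n = [\hat{W}_{13}(nR_{12}^* + 1), \dots, \hat{W}_{13}(nR_{13}^*)]$$
$$\hat{M}_{23}^n = [\hat{W}_{23}(nR_{12}^* + 1), \dots, \hat{W}_{23}(nR_{23}^*)]$$
$$\hat{M}_{123}^n = [\hat{W}_{13}^n \oplus \hat{W}_{23}^n, \hat{W}_{13}^n, \hat{W}_{123}^n].$$

Then we have achieved a rate vector of

$$\mathbf{R}^{(16)} = \mathbf{R}^* + R_{12}^* \,(0, 0, 0, -1, -1, -1, 2)^T = \mathbf{R}^* + \mathbf{H}_{BC,3}(16) \lambda_{16}$$

with $\lambda_{16} = R_{12}^*$. Thus the 16 rate vectors satisfying equation (4) are achievable and by time
sharing between them, all rate vectors in the region given by equation (3) are achievable.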
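
The coding step for i = 16 can be demonstrated on the three pairwise links alone. In the sketch below two equal-length streams `stream_a` and `stream_b` (the first two parts of $M_{123}$) are sent on links 12 and 13, and their bitwise XOR on link 23; each receiver sees only its own two links. The function and dictionary names are illustrative, and error-free delivery on each link is assumed.

```python
from itertools import product

# Pairwise links visible to each receiver.
VISIBLE = {1: ("W12", "W13"), 2: ("W12", "W23"), 3: ("W13", "W23")}


def xor_bits(a, b):
    return [x ^ y for x, y in zip(a, b)]


def encode_pairwise_links(stream_a, stream_b):
    """Place a on link 12, b on link 13 and a XOR b on link 23."""
    return {"W12": list(stream_a),
            "W13": list(stream_b),
            "W23": xor_bits(stream_a, stream_b)}


def decode(receiver, links):
    """Recover (a, b) using only the two links the receiver can see."""
    seen = {name: links[name] for name in VISIBLE[receiver]}
    if receiver == 1:
        return seen["W12"], seen["W13"]
    if receiver == 2:
        return seen["W12"], xor_bits(seen["W12"], seen["W23"])
    if receiver == 3:
        return xor_bits(seen["W13"], seen["W23"]), seen["W13"]
    raise ValueError("receiver must be 1, 2 or 3")


# Exhaustive check over all pairs of 2-bit streams.
all_ok = all(
    decode(r, encode_pairwise_links(a, b)) == (list(a), list(b))
    for a in product((0, 1), repeat=2)
    for b in product((0, 1), repeat=2)
    for r in (1, 2, 3)
)
print("every receiver recovers both streams:", all_ok)
```

Three pairwise links of rate $R_{12}^*$ thus deliver $2R_{12}^*$ of fully common rate, which is why this operation cannot be obtained by relabeling alone.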

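It remains to show that this region is equivalent to the one in equation (1), i.e. that for
any $\mathbf{R}^* \in \mathbb{R}^7_+$,

$$\{\mathbf{R} \in \mathbb{R}^7_+ \mid \mathbf{G}_{BC,3}^T \mathbf{R} \preceq \mathbf{G}_{BC,3}^T \mathbf{R}^*\} = \{\mathbf{R} \in \mathbb{R}^7_+ \mid \mathbf{R} \preceq \mathbf{R}^* + \mathbf{H}_{BC,3} \boldsymbol{\lambda} \text{ for some } \boldsymbol{\lambda} \in \mathbb{R}^{16}_+\}.$$

On the left is the characterization of the polytope in terms of the hyperplanes bounding
it. On the right is the dual characterization in terms of the edges of the polytope
(1-dimensional faces). This equivalence can be demonstrated using computer software such
as polymake.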



5.2 Converse Part 

To establish the converse we now present, for each $\mathbf{R}^*$, a particular (deterministic)
broadcast channel and show its capacity region is equal to (1). Let the input alphabet $\mathcal{X} =
\prod_{\mathcal{I} \subseteq \{1,2,3\}} \{0, \dots, 2^{nR_{\mathcal{I}}^*} - 1\}$ with the ith channel input

$$X(i) = [X_1(i), X_2(i), X_3(i), X_{12}(i), X_{13}(i), X_{23}(i), X_{123}(i)]$$

so that each $X_{\mathcal{I}} \in \{0, \dots, 2^{nR_{\mathcal{I}}^*} - 1\}$, and let $\mathcal{Y}_i = \prod_{\mathcal{I} \subseteq \{1,2,3\},\, i \in \mathcal{I}} \{0, \dots, 2^{nR_{\mathcal{I}}^*} - 1\}$ for i = 1, 2, 3
with

$$Y_1(i) = [X_1(i), X_{12}(i), X_{13}(i), X_{123}(i)]$$
$$Y_2(i) = [X_2(i), X_{12}(i), X_{23}(i), X_{123}(i)]$$
$$Y_3(i) = [X_3(i), X_{13}(i), X_{23}(i), X_{123}(i)].$$



Figure 7: Illustration of the deterministic broadcast channel used in converse.

See figure 7 for an illustration of the channel. Suppose the channel is used n times. The
messages to be transmitted are $W_{\mathcal{I}} \sim U(\{1, \dots, 2^{nR_{\mathcal{I}}}\})$ and mutually independent. Denote
the length-n vector of channel inputs by $\mathbf{X}$ and the length-n vectors of channel outputs by
$\mathbf{Y}_1$, $\mathbf{Y}_2$ and $\mathbf{Y}_3$. Let $\mathbf{G}_{BC,3}(i)$ denote the ith column of $\mathbf{G}_{BC,3}$. We wish to show

$$\mathbf{G}_{BC,3}(i)^T \mathbf{R} \le \mathbf{G}_{BC,3}(i)^T \mathbf{R}^* \qquad (5)$$

for i = 1, ..., 15. Before this we introduce some notation. Suppose $\mathcal{A}$ is a collection of
subsets of {1, 2, 3}, for example $\mathcal{A}$ = {1, 2, 12, 13, 123}. The collection $\mathcal{A}$ should be thought
of as the indices of a subset of the seven channel links (see figure 7), for example $\mathcal{A}$ = {1, 123}
corresponds to two links, the private one from $X_1$ to $Y_{1,1}$ and the common one from $X_{123}$
to $Y_{1,123}$, $Y_{2,123}$, $Y_{3,123}$. By $\lfloor \mathcal{A} \rfloor$ we denote the indices of the messages intended for those
receivers cut by $\mathcal{A}$. For example if $\mathcal{A}$ = {1, 2, 12, 13, 123} then all links to the first receiver
are cut, but not all links to the second or the third. As the first receiver is sent the
messages $W_1$, $W_{12}$, $W_{13}$ and $W_{123}$, we have $\lfloor \mathcal{A} \rfloor$ = {1, 12, 13, 123}. As another example let
$\mathcal{A}$ = {2, 3, 12, 13, 23, 123}. Then all links to both the second and third receivers are cut and
$\lfloor \mathcal{A} \rfloor$ = {2, 3, 12, 13, 23, 123} = $\mathcal{A}$. As a final example if $\mathcal{A}$ = {1, 12, 23, 123} no receivers are
completely cut, and thus $\lfloor \mathcal{A} \rfloor = \emptyset$.
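
The floor operation on a collection of links is easy to compute mechanically. The sketch below represents each link by its index string ("1", "12", "123", and so on) and reproduces the three examples just given; the function names are illustrative.

```python
RECEIVERS = (1, 2, 3)
ALL_LINKS = ("1", "2", "3", "12", "13", "23", "123")


def links_to(receiver):
    """Indices of the four links (and messages) that reach a receiver."""
    return {link for link in ALL_LINKS if str(receiver) in link}


def floor_cut(cut):
    """Messages intended for receivers whose links all lie in `cut`."""
    cut = set(cut)
    covered = set()
    for receiver in RECEIVERS:
        if links_to(receiver) <= cut:
            covered |= links_to(receiver)
    return covered


print(sorted(floor_cut({"1", "2", "12", "13", "123"})))
print(sorted(floor_cut({"2", "3", "12", "13", "23", "123"})))
print(sorted(floor_cut({"1", "12", "23", "123"})))
```

A receiver counts as cut only when all four of its links are in the collection, which is why the last example returns the empty set even though every receiver has lost some links.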

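Lemma 5.1. Let $\mathcal{A}_1$, $\mathcal{A}_2$ and $\mathcal{A}_3$ be three collections of subsets of {1, 2, 3} such that either
$\mathcal{A}_1 \subseteq \mathcal{A}_2 \cup \mathcal{A}_3$, $\mathcal{A}_2 \subseteq \mathcal{A}_1 \cup \mathcal{A}_3$ or $\mathcal{A}_3 \subseteq \mathcal{A}_1 \cup \mathcal{A}_2$. Then

$$\sum_{\mathcal{I} \in \lfloor \mathcal{A}_1 \cup \mathcal{A}_2 \cup \mathcal{A}_3 \rfloor} R_{\mathcal{I}}
+ \sum_{\mathcal{I} \in \lfloor \mathcal{A}_1 \cup \mathcal{A}_2 \rfloor \cap \lfloor \mathcal{A}_1 \cup \mathcal{A}_3 \rfloor \cap \lfloor \mathcal{A}_2 \cup \mathcal{A}_3 \rfloor} R_{\mathcal{I}}
+ \sum_{\mathcal{I} \in \lfloor \mathcal{A}_1 \rfloor \cap \lfloor \mathcal{A}_2 \rfloor \cap \lfloor \mathcal{A}_3 \rfloor} R_{\mathcal{I}}
\le \sum_{\mathcal{I} \in \mathcal{A}_1} R_{\mathcal{I}}^* + \sum_{\mathcal{I} \in \mathcal{A}_2} R_{\mathcal{I}}^* + \sum_{\mathcal{I} \in \mathcal{A}_3} R_{\mathcal{I}}^*.$$

This lemma is a generalization of the cutset bounds to multiple subsets of cuts. Indeed if
we set $\mathcal{A}_2 = \emptyset$ and $\mathcal{A}_3 = \emptyset$ we are left with

$$\sum_{\mathcal{I} \in \lfloor \mathcal{A}_1 \rfloor} R_{\mathcal{I}} \le \sum_{\mathcal{I} \in \mathcal{A}_1} R_{\mathcal{I}}^*$$

which are precisely the cutset bounds.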
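
To see what inequality the lemma produces for a given triple of collections, one can tabulate how often each rate appears on either side. The sketch below does this for the link-index representation "1", "12", "123", and so on; the example collections are rows 1, 8 and 11 of table 1, and all function names are illustrative.

```python
RECEIVERS = (1, 2, 3)
ALL_LINKS = ("1", "2", "3", "12", "13", "23", "123")


def links_to(receiver):
    return {link for link in ALL_LINKS if str(receiver) in link}


def floor_cut(cut):
    """Messages intended for receivers whose links all lie in `cut`."""
    covered = set()
    for receiver in RECEIVERS:
        if links_to(receiver) <= cut:
            covered |= links_to(receiver)
    return covered


def precondition_holds(a1, a2, a3):
    """One collection must be contained in the union of the other two."""
    return a1 <= a2 | a3 or a2 <= a1 | a3 or a3 <= a1 | a2


def lemma_coefficients(a1, a2, a3):
    """Coefficients of R (left side) and R* (right side) in the lemma."""
    left_sets = [
        floor_cut(a1 | a2 | a3),
        floor_cut(a1 | a2) & floor_cut(a1 | a3) & floor_cut(a2 | a3),
        floor_cut(a1) & floor_cut(a2) & floor_cut(a3),
    ]
    left = tuple(sum(link in s for s in left_sets) for link in ALL_LINKS)
    right = tuple(sum(link in s for s in (a1, a2, a3)) for link in ALL_LINKS)
    return left, right


row_1 = ({"1", "12", "13", "123"}, set(), set())
row_8 = ({"1", "3", "12", "13", "123"}, {"2", "12", "23", "123"}, set())
row_11 = ({"1", "12", "13", "123"}, {"2", "12", "23", "123"}, {"3", "13", "23"})

for name, row in (("row 1", row_1), ("row 8", row_8)):
    print(name, lemma_coefficients(*row))
print("row 11 satisfies the precondition:", precondition_holds(*row_11))
```

For rows 1 and 8 the two coefficient vectors coincide, so the lemma gives a bound of the form $\mathbf{g}^T \mathbf{R} \le \mathbf{g}^T \mathbf{R}^*$. Row 11 fails the containment condition, so the lemma does not apply to it.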
Proof. 

> £ ff(X x ) + + E ^( X ^) 

leAi leA 2 leA s 

> H (U TeAl Xj) + H (U JeA2 X T ) + H (Uj^Xi) 

= H (Ujg^u^u^Xx) + / (Ujg^ju^Xj; Uxg^u^Xj; Uj e ^ 2ey 4 3 Xj) 
+ I (Uje^Xj; Ux 6 ^ 2 Xx; Uj g ^ 3 Xi) 

> iif (Uje^ju^au^sjWi) + (UzeL^iU^aJnLAu^JnL^u^aJ Wi) 

+ -H" (UjeL^iJnL^JnL^aJ + e n 

= n( E ^+ E ^+ E ^) 

where the third step follows from lemma 17.21 in the appendix and the fourth from the 
requirement Pe^ —>■ (Fano's inequality) and lemma I7TT1 in the appendix. □ 

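Applying lemma 5.1 to the sets of indices in table 1 establishes equation (5) for columns
i = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15 of $\mathbf{G}_{BC,3}$. Unfortunately for column i = 11 the
condition that either $\mathcal{A}_1 \subseteq \mathcal{A}_2 \cup \mathcal{A}_3$, $\mathcal{A}_2 \subseteq \mathcal{A}_1 \cup \mathcal{A}_3$ or $\mathcal{A}_3 \subseteq \mathcal{A}_1 \cup \mathcal{A}_2$ must hold is violated.
Consequently the 11th converse bound is established in a different fashion.
Let $\mathcal{A}_1$, $\mathcal{A}_2$, $\mathcal{A}_3$ be defined by the 11th row of table 1. Then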



 i    A_1                                   A_2                          A_3
 1    {1}, {12}, {13}, {123}                ∅                            ∅
 2    {2}, {12}, {23}, {123}                ∅                            ∅
 3    {3}, {13}, {23}, {123}                ∅                            ∅
 4    {1}, {2}, {12}, {13}, {23}, {123}     ∅                            ∅
 5    {1}, {3}, {12}, {13}, {23}, {123}     ∅                            ∅
 6    {2}, {3}, {12}, {13}, {23}, {123}     ∅                            ∅
 7    {1}, {2}, {3}, {12}, {13}, {23}, {123}  ∅                          ∅
 8    {1}, {3}, {12}, {13}, {123}           {2}, {12}, {23}, {123}       ∅
 9    {1}, {2}, {12}, {13}, {123}           {3}, {13}, {23}, {123}       ∅
10    {1}, {2}, {12}, {23}, {123}           {3}, {13}, {23}, {123}       ∅
11    {1}, {12}, {13}, {123}                {2}, {12}, {23}, {123}       {3}, {13}, {23}
12    {1}, {2}, {12}, {13}, {123}           {2}, {3}, {12}, {23}, {123}  {3}, {13}, {23}, {123}
13    {1}, {3}, {13}, {23}, {123}           {1}, {2}, {12}, {23}, {123}  {1}, {12}, {13}, {123}
14    {1}, {2}, {12}, {13}, {123}           {2}, {12}, {23}, {123}       {1}, {3}, {13}, {23}, {123}
15    {1}, {2}, {12}, {13}, {123}           {2}, {3}, {12}, {23}, {123}  {1}, {3}, {13}, {23}, {123}

Figure 8: The (1, 1, 1)-multicast region for the broadcast channel, L = 2.



XZeAi leA 2 leA 3 / 

> J2 H ^i) + E H ^ + E H ^ 

leAi TeA 2 leA 3 

> H (Uxe^Xx) + H (Ux e ^ 2 Xx) + H (U^Xx) 

> H (Ux 6A Xx) + H (Uje^Xi) + H (u Ie ^ u{123} X J ) - ii (X 123 ) 

> H(U IeAl X x \W 1 , W 12 , W 13 , W 123 ) + H{W 1 , W 12 , Wi 3 , W 123 ) 
+ H(U IeA2 X I \W 2 , W 12 , W 23 , W 123 ) + if (W 2 , W 12 , W 23 , W 123 ) 

+ ^(Ux 6 ^u{i23}Xx|W 3 , W 13 , W 23 ) + H(W 3 , W 13 , W 23 ) - if (X 123 ) 

> if (X 123 | W l5 W 12 , Wi 3 , W 123 ) + if (Wx, W 12 , W 13 , W 123 ) 
+ ff (X 123 | W 2 , Wia, W 23 , W 123 ) + if (W 2 , W 12 , W 23 , Wiaa) 
+ if (X 123 |W 3 , W 13 , W 23 ) + if(W 3 , W 13 , W 23 ) - if (X 123 ) 

> if (W 1; W 12 , Wi 3j W 123 ) + H(W 2 , W 12 , W 23 , W 123 ) + if(W 3 , W 13 , W 23 ) 

= if (Wi) + ff (W 2 ) + ff (W 3 ) + 2ff (Wi 2 ) + 2if (Wi 3 ) + 2if (W 23 ) + 2ff (W 123 ) 
= n{Rx + R 2 + R 3 + 2R 12 + 2R 13 + 2R 23 + 2R 123 ) 

where the fourth step follows from the requirement Pen) —>■ (Fano's inequality), the sixth 
from lemma 17.31 and the seventh from the independence of the messages. 

6 Proof of Theorem 4.2 



The direct part of this proof is entirely analogous to the direct part for the broadcast
channel. This establishes the universal achievability of the $\mathbf{R}^*$-multicast region. The converse
part is different. For each $\mathbf{R}^*$ we present a sequence of channels. The limiting intersection
of the capacity regions of these channels is the region in equation (2). The capacity regions
of these channels are not precisely computed, but only outer bounded in a manner sufficient
to establish their limiting intersection.



6.1 Direct part 

As this part of the proof is trivial and entirely analogous to section 5.1 we only provide a
sketch. In essence we need to establish that each of the columns of $\mathbf{H}_{MAC,3}$ is achievable
in the sense of section 5.1. The first column is achieved by transmitting additional $M_{12}$
bits on the $W_1$ channel, the second column is achieved by transmitting additional $M_{13}$ bits
on the $W_1$ channel, the third column is achieved by transmitting additional $M_{12}$ bits on
the $W_2$ channel, and so on. The last column is achieved by lowering the rate of the $M_{123}$
message.



6.2 Converse part 

For each R* we present a sequence of deterministic channels with capacity region tending
to the region in equation (2). The capacity regions of these channels are not explicitly
computed, only outer bounded, but we show the limiting outer bound is tight. The sequence
is parameterized by the integer k.

Let R* be given and assume its elements are rational. Denote their numerators and
denominators by $N_{\mathcal{I}}$ and $D_{\mathcal{I}}$ for $\mathcal{I} \subseteq \{1,2,3\}$, so that
$R^* = (N_1/D_1, \dots, N_{123}/D_{123})$. Let $l = \mathrm{LCM}(D_1, \dots, D_{123})$. The kth channel
is defined as follows. See figure 9 for a pictorial representation. Every $k \times l$ time steps
the channel takes in a triple of inputs and outputs one symbol. The input alphabet is
$\mathcal{X} = \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3$ where

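\begin{align*}
\mathcal{X}_1 &= \{0,1\}^{kN_1} \times \{0,1\}^{kN_{12}} \times \{0,1\}^{kN_{13}} \times \{0,1\}^{kN_{123}} \\
\mathcal{X}_2 &= \{0,1\}^{kN_2} \times \{0,1\}^{kN_{12}} \times \{0,1\}^{kN_{23}} \times \{0,1\}^{kN_{123}} \\
\mathcal{X}_3 &= \{0,1\}^{kN_3} \times \{0,1\}^{kN_{13}} \times \{0,1\}^{kN_{23}} \times \{0,1\}^{kN_{123}}
\end{align*}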

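The output alphabet is

\begin{align*}
\mathcal{Y} = {} & \{0,1\}^{kN_1} \times \{0,1\}^{kN_2} \times \{0,1\}^{kN_3} \\
& \times (\{0,1\}^{kN_{12}} \cup \{e\}) \times (\{0,1\}^{kN_{13}} \cup \{e\}) \times (\{0,1\}^{kN_{23}} \cup \{e\}) \times (\{0,1\}^{kN_{123}} \cup \{e\})
\end{align*}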

where $e$ is an output symbol that can be thought of as an erasure. The channel thus
decomposes into one with $4 \times 3 = 12$ inputs and 7 outputs. The outputs at time $i$ are
related deterministically to the inputs at time $i$ via

\begin{align*}
Y_1(i) &= X_{1,1}(i) \\
Y_2(i) &= X_{2,1}(i) \\
Y_3(i) &= X_{3,1}(i)
\end{align*}



Figure 9: Illustration of the deterministic multiple access channel used in the converse.



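\begin{align*}
Y_{12}(i) &= \begin{cases} X_{1,12}(i) & \text{if } X_{1,12}(i) = X_{2,12}(i) \\ e & \text{otherwise} \end{cases} \\
Y_{13}(i) &= \begin{cases} X_{1,13}(i) & \text{if } X_{1,13}(i) = X_{3,13}(i) \\ e & \text{otherwise} \end{cases} \\
Y_{23}(i) &= \begin{cases} X_{2,23}(i) & \text{if } X_{2,23}(i) = X_{3,23}(i) \\ e & \text{otherwise} \end{cases} \\
Y_{123}(i) &= \begin{cases} X_{1,123}(i) & \text{if } X_{1,123}(i) = X_{2,123}(i) = X_{3,123}(i) \\ e & \text{otherwise} \end{cases}
\end{align*}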

The input streams thus consist of blocks of $kN_{\mathcal{I}}$ bits. The output streams $Y_1(i)$, $Y_2(i)$, $Y_3(i)$
match their associated input streams. The output stream $Y_{12}(i)$ matches its associated
input streams if and only if the input streams match at each bit, otherwise the erasure
symbol is output. Likewise for the other output streams. For this reason the boxes
inside the channel in figure 9 are labeled 'coordination channel'. See figure 10 for a pictorial
example of one such coordination channel. The idea of the coordination channels is that in
the limit of large k, they only let common information through. This should be intuitive
from their definition and from the figure.
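The behaviour of the coordination channels can be illustrated with a short sketch. The code below is only an illustration of the deterministic input-output map described above: the dictionary layout, the key name "private", and the helper names are assumptions made for this example and do not appear in the paper.

```python
ERASURE = "e"  # stands in for the erasure symbol e


def coordination_channel(*blocks):
    """Pass the block through only if every input block is identical."""
    first = blocks[0]
    return first if all(block == first for block in blocks[1:]) else ERASURE


def channel_use(x1, x2, x3):
    """One use of the deterministic channel.

    Each argument is a dict from a component label to a bit string. The
    label "private" and the dict layout are illustrative assumptions.
    """
    return {
        "1": x1["private"],
        "2": x2["private"],
        "3": x3["private"],
        "12": coordination_channel(x1["12"], x2["12"]),
        "13": coordination_channel(x1["13"], x3["13"]),
        "23": coordination_channel(x2["23"], x3["23"]),
        "123": coordination_channel(x1["123"], x2["123"], x3["123"]),
    }


# Inputs 1 and 3 disagree on their shared "13" block, so Y_13 is erased.
x1 = {"private": "01", "12": "11", "13": "00", "123": "10"}
x2 = {"private": "10", "12": "11", "23": "01", "123": "10"}
x3 = {"private": "00", "13": "01", "23": "01", "123": "10"}
outputs = channel_use(x1, x2, x3)
print(outputs)
```

A block survives a coordination channel only when all of its inputs carry exactly the same bits, which is why uncoordinated (private) information on those components is wiped out.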

We now bound the capacity region of this channel. It is clear that we can further decompose
the channel into seven parallel channels, one linking $X_{1,1}$ and $Y_1$, one linking $X_{2,1}$ and $Y_2$,
one linking $X_{3,1}$ and $Y_3$, one linking $(X_{1,12}, X_{2,12})$ and $Y_{12}$, one linking $(X_{1,13}, X_{3,13})$ and
$Y_{13}$, one linking $(X_{2,23}, X_{3,23})$ and $Y_{23}$, and one linking $(X_{1,123}, X_{2,123}, X_{3,123})$ and $Y_{123}$. The
capacity region of the channel in question is thus the Minkowski sum of the capacity regions
of these seven channels. Denote these seven capacity regions by $C_{\mathcal{I}}^k$ for $\mathcal{I} \subseteq \{1,2,3\}$. Then




Figure 10: (a) a coordination channel for k = 1. (b) a coordination channel for k = 2.



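the capacity region of our channel is given by

\[ C^k = \sum_{\mathcal{I} \subseteq \{1,2,3\}} C_{\mathcal{I}}^k \]

where $\sum$ denotes the Minkowski sum. In particular we wish to compute the limiting
intersection of these regions

\begin{align*}
C &= \lim_{K \to \infty} \bigcap_{k=1}^{K} C^k \\
&= \sum_{\mathcal{I} \subseteq \{1,2,3\}} \lim_{K \to \infty} \bigcap_{k=1}^{K} C_{\mathcal{I}}^k \\
&= \sum_{\mathcal{I} \subseteq \{1,2,3\}} C_{\mathcal{I}}.
\end{align*}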

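Lemma 6.1.

1. The region $C_1$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_1 + R_{12} + R_{13} + R_{123} \le R_1^*$ and
$R_{\mathcal{I}} = 0$ for $\mathcal{I} \in \{2, 3, 23\}$,

2. The region $C_2$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_2 + R_{12} + R_{23} + R_{123} \le R_2^*$ and
$R_{\mathcal{I}} = 0$ for $\mathcal{I} \in \{1, 3, 13\}$,

3. The region $C_3$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_3 + R_{13} + R_{23} + R_{123} \le R_3^*$ and
$R_{\mathcal{I}} = 0$ for $\mathcal{I} \in \{1, 2, 12\}$,

4. The region $C_{12}$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_{12} + R_{123} \le R_{12}^*$ and $R_{\mathcal{I}} = 0$ for
$\mathcal{I} \in \{1, 2, 3, 13, 23\}$,

5. The region $C_{13}$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_{13} + R_{123} \le R_{13}^*$ and $R_{\mathcal{I}} = 0$ for
$\mathcal{I} \in \{1, 2, 3, 12, 23\}$,

6. The region $C_{23}$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_{23} + R_{123} \le R_{23}^*$ and $R_{\mathcal{I}} = 0$ for
$\mathcal{I} \in \{1, 2, 3, 12, 13\}$,

7. The region $C_{123}$ is the set of all $R \in \mathbb{R}_+^7$ satisfying $R_{123} \le R_{123}^*$ and $R_{\mathcal{I}} = 0$ for
$\mathcal{I} \in \{1, 2, 3, 12, 13, 23\}$.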

Proof. The first three regions are trivial. We establish the fourth. The messages $W_{\mathcal{I}}$ are
uniformly distributed on $\{1, \dots, 2^{nR_{\mathcal{I}}}\}$ and mutually independent for fixed $n$. Denote the
$n$-length output sequence by $Y_{12}$. By Fano's inequality we must have
$H(\cup_{\mathcal{I} \subseteq \{1,2,3\}} W_{\mathcal{I}} \mid Y_{12}) \le \epsilon_n$ with $\epsilon_n \to 0$ as $n \to \infty$ in order for the error
probability to be made arbitrarily small. Thus by the mutual independence of the messages we have

\begin{align*}
n(R_1 + R_2 + R_3 + R_{13} + R_{23}) &= H(W_1) + H(W_2) + H(W_3) + H(W_{13}) + H(W_{23}) \\
&\le H(\cup_{\mathcal{I} \subseteq \{1,2,3\}} W_{\mathcal{I}}, Y_{12}) - H(W_{12}, W_{123}) \\
&\le H(Y_{12}) - H(W_{12}, W_{123}) + \epsilon_n \\
&\le H(Y_{12} \mid W_{12}, W_{123}) + \epsilon_n
\end{align*}



Assume for simplicity that $n = mk$ where $m$ is an integer. We write $Y_{12} = [Y_{12}^1, \dots, Y_{12}^m]$
where $Y_{12}^i$ represents the $i$th block of $k$ symbols in $Y_{12}$. Similarly $X^i$ represents the $i$th block
of $k$ symbols in $X$. We also use the shorthand $W = \{W_{12}, W_{123}\}$. We proceed to show
that $H(Y_{12}|W)$ is sufficiently small.

\begin{align*}
H(Y_{12}|W) &\le \sum_{i=1}^{m} H(Y_{12}^i|W) \\
&= -\sum_{i=1}^{m} \sum_{w} P(W = w) \sum_{x} P(Y_{12}^i = x|W = w) \log P(Y_{12}^i = x|W = w)
\end{align*}



From the channel definition we have

\[
P(Y_{12}^i = x|W = w) =
\begin{cases}
P(X_{1,12}^i = x, X_{2,12}^i = x \mid W = w) & x \ne e; \\
P(X_{1,12}^i \ne X_{2,12}^i \mid W = w) & x = e.
\end{cases}
\]



Using this expression and the conditional independence of $X_{1,12}^i$ and $X_{2,12}^i$ given $W$ we
have

\begin{align*}
&-\sum_{x} P(Y_{12}^i = x|W = w) \log P(Y_{12}^i = x|W = w) \\
&\quad = -P(X_{1,12}^i \ne X_{2,12}^i|W = w) \log P(X_{1,12}^i \ne X_{2,12}^i|W = w) \\
&\qquad - \sum_{x} P(X_{1,12}^i = x|W = w) P(X_{2,12}^i = x|W = w) \log P(X_{1,12}^i = x|W = w) \\
&\qquad - \sum_{x} P(X_{1,12}^i = x|W = w) P(X_{2,12}^i = x|W = w) \log P(X_{2,12}^i = x|W = w).
\end{align*}

The first term can be upper bounded by 1 (as $-x \log_2 x \le 1$ for all $x \in [0,1]$). The second
term can also be upper bounded by 1. To see this, maximize first over the distribution
$P(X_{2,12}^i|W = w)$ and then over the distribution $P(X_{1,12}^i|W = w)$,

\begin{align*}
&\max_{P(X_{1,12}^i|W = w),\, P(X_{2,12}^i|W = w)} -\sum_{x} P(X_{1,12}^i = x|W = w) P(X_{2,12}^i = x|W = w) \log P(X_{1,12}^i = x|W = w) \\
&\quad = \max_{P(X_{1,12}^i|W = w)} \max_{x} \, -P(X_{1,12}^i = x|W = w) \log P(X_{1,12}^i = x|W = w) \\
&\quad \le 1.
\end{align*}



Likewise the third term can be upper bounded by 1. Thus putting this all together we have

\begin{align*}
H(Y_{12}|W) &\le 3 \sum_{i=1}^{m} \sum_{w} P(W = w) \\
&= 3m
\end{align*}

and so

\begin{align*}
R_1 + R_2 + R_3 + R_{13} + R_{23} &\le 3m/n \\
&= 3/k
\end{align*}
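The two elementary bounds used in this step can be spot-checked numerically. The sketch below evaluates $-x \log_2 x$ on a grid over $[0,1]$ (its maximum is $\log_2(e)/e \approx 0.531$, attained at $x = 1/e$) and evaluates the cross term $-\sum_x p(x) q(x) \log_2 p(x)$ for randomly drawn pmfs. The alphabet size, sample count, and function names are arbitrary choices made for this illustration.

```python
import math
import random


def neg_x_log2_x(x):
    """Return -x * log2(x), using the convention 0 * log 0 = 0."""
    return 0.0 if x == 0 else -x * math.log2(x)


# Largest value of -x log2 x on a fine grid over [0, 1].
grid_max = max(neg_x_log2_x(i / 10000) for i in range(10001))


def cross_term(p, q):
    """Return -sum_x p(x) q(x) log2 p(x) for two pmfs on one alphabet."""
    return sum(qx * neg_x_log2_x(px) for px, qx in zip(p, q))


def random_pmf(size, rng):
    """Draw a pmf by normalizing independent uniform weights."""
    weights = [rng.random() for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
worst_cross_term = max(
    cross_term(random_pmf(8, rng), random_pmf(8, rng)) for _ in range(1000)
)
print(grid_max, worst_cross_term)
```

Both quantities stay below 1, consistent with each of the three terms being bounded by 1 regardless of the block length.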



Then by letting $k \to \infty$ we have $R_{\mathcal{I}} = 0$ for $\mathcal{I} \in \{1, 2, 3, 13, 23\}$. From the structure of
the coordination channel it is clear that we can achieve the points $(R_{12}, R_{123}) = (R_{12}^*, 0)$ and
$(R_{12}, R_{123}) = (0, R_{12}^*)$. By time-sharing we can achieve all points in the region
$R_{12} + R_{123} \le R_{12}^*$. Conversely from Fano's inequality we have

\begin{align*}
n(R_{12} + R_{123}) &= H(W_{12}) + H(W_{123}) \\
&\le H(Y_{12}) \\
&= n(R_{12}^* + \delta_k)
\end{align*}

where $\delta_k \to 0$ as $k \to \infty$. This establishes the fourth component of the lemma. The
remaining components are established in the same manner. We omit the details. □

It remains to show that the region $\sum_{\mathcal{I} \subseteq \{1,2,3\}} C_{\mathcal{I}}$ corresponds to the region in equation (2).

7 Appendix 

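Lemma 7.1. Let $\mathcal{X}_1$, $\mathcal{X}_2$ and $\mathcal{X}_3$ be three sets of random variables satisfying at least one of
the properties $\mathcal{X}_1 \subseteq \mathcal{X}_2 \cup \mathcal{X}_3$, $\mathcal{X}_2 \subseteq \mathcal{X}_1 \cup \mathcal{X}_3$ or $\mathcal{X}_3 \subseteq \mathcal{X}_1 \cup \mathcal{X}_2$. Let $W$ be a random variable
that satisfies $H(W|\mathcal{X}_i) = 0$ for $i = 1, 2, 3$. Then

\[ I(\mathcal{X}_1; \mathcal{X}_2; \mathcal{X}_3) \ge H(W). \]

Proof.

\begin{align*}
I(\mathcal{X}_1; \mathcal{X}_2; \mathcal{X}_3) &= I(W, \mathcal{X}_1; W, \mathcal{X}_2; W, \mathcal{X}_3) \\
&= H(W, \mathcal{X}_1) + H(W, \mathcal{X}_2) + H(W, \mathcal{X}_3) \\
&\quad - H(W, \mathcal{X}_1, \mathcal{X}_2) - H(W, \mathcal{X}_1, \mathcal{X}_3) - H(W, \mathcal{X}_2, \mathcal{X}_3) + H(W, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) \\
&= H(W) + H(\mathcal{X}_1|W) + H(\mathcal{X}_2|W) + H(\mathcal{X}_3|W) \\
&\quad - H(\mathcal{X}_1, \mathcal{X}_2|W) - H(\mathcal{X}_1, \mathcal{X}_3|W) - H(\mathcal{X}_2, \mathcal{X}_3|W) + H(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3|W) \\
&= H(W) + I(\mathcal{X}_1; \mathcal{X}_2; \mathcal{X}_3|W) \\
&\ge H(W)
\end{align*}

where the last step follows from lemma 7.4. □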
Lemma 7.2.

\[ H(A) + H(B) + H(C) = H(A, B, C) + I(A, B; A, C; B, C) + I(A; B; C) \]

Proof. ITIP. □
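The identity in lemma 7.2 can also be spot-checked numerically by expanding each three-way mutual information via inclusion-exclusion over joint entropies. The sketch below does this for one randomly drawn joint pmf of three binary variables; the binary alphabet, the seed, and the helper names are assumptions made for this illustration, and a numerical check is of course not a substitute for the proof.

```python
import itertools
import math
import random


def random_joint_pmf(num_vars, rng):
    """Random joint pmf over num_vars binary variables, as {outcome: prob}."""
    outcomes = list(itertools.product((0, 1), repeat=num_vars))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def entropy(pmf, indices):
    """Joint entropy in bits of the variables at the given positions."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in sorted(indices))
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def triple_information(pmf, u, v, t):
    """I(U;V;T) by inclusion-exclusion; u, v, t are sets of positions."""
    return (entropy(pmf, u) + entropy(pmf, v) + entropy(pmf, t)
            - entropy(pmf, u | v) - entropy(pmf, u | t) - entropy(pmf, v | t)
            + entropy(pmf, u | v | t))


rng = random.Random(0)
pmf = random_joint_pmf(3, rng)
A, B, C = {0}, {1}, {2}

lhs = entropy(pmf, A) + entropy(pmf, B) + entropy(pmf, C)
rhs = (entropy(pmf, A | B | C)
       + triple_information(pmf, A | B, A | C, B | C)
       + triple_information(pmf, A, B, C))
print(abs(lhs - rhs))
```

Because compound variables such as (A, B) are represented as sets of positions, the joint entropy of (A, B) and (A, C) correctly collapses to the entropy of (A, B, C).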

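Lemma 7.3. Let $X_1, \dots, X_n$ be a set of mutually independent r.v.'s. Let $\mathcal{X}_1$, $\mathcal{X}_2$ and $\mathcal{X}_3$ be
three subsets of these r.v.'s with the property $\mathcal{X}_1 \cap \mathcal{X}_2 \cap \mathcal{X}_3 = \emptyset$. Then for any r.v. $Y$

\[ H(Y|\mathcal{X}_1) + H(Y|\mathcal{X}_2) + H(Y|\mathcal{X}_3) \ge H(Y). \]

Proof.

\begin{align*}
&H(Y|\mathcal{X}_1) + H(Y|\mathcal{X}_2) + H(Y|\mathcal{X}_3) \\
&\ge H(Y|\mathcal{X}_1, \mathcal{X}_2^c) + H(Y|\mathcal{X}_2, \mathcal{X}_3^c) + H(Y|\mathcal{X}_3, \mathcal{X}_1^c) \\
&= H(Y, \mathcal{X}_1, \mathcal{X}_2^c) + H(Y, \mathcal{X}_2, \mathcal{X}_3^c) + H(Y, \mathcal{X}_3, \mathcal{X}_1^c) - H(\mathcal{X}_1, \mathcal{X}_2^c) - H(\mathcal{X}_2, \mathcal{X}_3^c) - H(\mathcal{X}_3, \mathcal{X}_1^c) \\
&= H(Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) + I(Y, \mathcal{X}_1, \mathcal{X}_2^c; Y, \mathcal{X}_2, \mathcal{X}_3^c; Y, \mathcal{X}_3, \mathcal{X}_1^c) \\
&\quad + I(Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3; Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3; Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) \\
&\quad - H(\mathcal{X}_1, \mathcal{X}_2^c) - H(\mathcal{X}_2, \mathcal{X}_3^c) - H(\mathcal{X}_3, \mathcal{X}_1^c) \\
&= 2H(Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) + I(Y, \mathcal{X}_1, \mathcal{X}_2^c; Y, \mathcal{X}_2, \mathcal{X}_3^c; Y, \mathcal{X}_3, \mathcal{X}_1^c) \\
&\quad - 2H(\mathcal{X}_1|\mathcal{X}_2 \cup \mathcal{X}_3) - 2H(\mathcal{X}_2|\mathcal{X}_1 \cup \mathcal{X}_3) - 2H(\mathcal{X}_3|\mathcal{X}_1 \cup \mathcal{X}_2) \\
&\quad - 2H(\mathcal{X}_1 \cap \mathcal{X}_2|\mathcal{X}_3) - 2H(\mathcal{X}_1 \cap \mathcal{X}_3|\mathcal{X}_2) - 2H(\mathcal{X}_2 \cap \mathcal{X}_3|\mathcal{X}_1) - 3H(\mathcal{X}_1 \cap \mathcal{X}_2 \cap \mathcal{X}_3) \\
&= 2H(Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) + I(Y, \mathcal{X}_1, \mathcal{X}_2^c; Y, \mathcal{X}_2, \mathcal{X}_3^c; Y, \mathcal{X}_3, \mathcal{X}_1^c) \\
&\quad - 2H(\mathcal{X}_1|\mathcal{X}_2 \cup \mathcal{X}_3) - 2H(\mathcal{X}_2|\mathcal{X}_1 \cup \mathcal{X}_3) - 2H(\mathcal{X}_3|\mathcal{X}_1 \cup \mathcal{X}_2) \\
&\quad - 2H(\mathcal{X}_1 \cap \mathcal{X}_2|\mathcal{X}_3) - 2H(\mathcal{X}_1 \cap \mathcal{X}_3|\mathcal{X}_2) - 2H(\mathcal{X}_2 \cap \mathcal{X}_3|\mathcal{X}_1) - 2H(\mathcal{X}_1 \cap \mathcal{X}_2 \cap \mathcal{X}_3) \\
&= 2H(Y, \mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) + I(Y, \mathcal{X}_1, \mathcal{X}_2^c; Y, \mathcal{X}_2, \mathcal{X}_3^c; Y, \mathcal{X}_3, \mathcal{X}_1^c) - 2H(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3) \\
&\ge I(Y, \mathcal{X}_1, \mathcal{X}_2^c; Y, \mathcal{X}_2, \mathcal{X}_3^c; Y, \mathcal{X}_3, \mathcal{X}_1^c) \\
&\ge H(Y)
\end{align*}

where the third step follows from lemma 7.2, the fourth from a set expansion made possible
by the mutual independence of the underlying r.v.'s $X_1, \dots, X_n$, the fifth from the property
$\mathcal{X}_1 \cap \mathcal{X}_2 \cap \mathcal{X}_3 = \emptyset$, the sixth by a set relationship, and the eighth by lemma 7.1. □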

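Lemma 7.4. Let $\mathcal{X}_1$, $\mathcal{X}_2$ and $\mathcal{X}_3$ be sets of random variables. If either $\mathcal{X}_1 \subseteq \mathcal{X}_2 \cup \mathcal{X}_3$,
$\mathcal{X}_2 \subseteq \mathcal{X}_1 \cup \mathcal{X}_3$ or $\mathcal{X}_3 \subseteq \mathcal{X}_1 \cup \mathcal{X}_2$ then for any r.v. $W$,

\[ I(\mathcal{X}_1; \mathcal{X}_2; \mathcal{X}_3|W) \ge 0. \]

Proof. Assume without loss of generality that the containment property $\mathcal{X}_3 \subseteq \mathcal{X}_1 \cup \mathcal{X}_2$
holds. Then

\begin{align*}
&I(\mathcal{X}_1; \mathcal{X}_2; \mathcal{X}_3|W) \\
&= H(\mathcal{X}_1|W) + H(\mathcal{X}_2|W) + H(\mathcal{X}_3|W) \\
&\quad - H(\mathcal{X}_1, \mathcal{X}_2|W) - H(\mathcal{X}_1, \mathcal{X}_3|W) - H(\mathcal{X}_2, \mathcal{X}_3|W) + H(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3|W) \\
&= H(\mathcal{X}_1|W) + H(\mathcal{X}_2|W) + H(\mathcal{X}_3|W) - H(\mathcal{X}_1, \mathcal{X}_3|W) - H(\mathcal{X}_2, \mathcal{X}_3|W) \\
&= I(\mathcal{X}_1; \mathcal{X}_3|W) + I(\mathcal{X}_2; \mathcal{X}_3|W) - H(\mathcal{X}_3|W) \\
&\ge H(\mathcal{X}_1 \cap \mathcal{X}_3|W) + H(\mathcal{X}_2 \cap \mathcal{X}_3|W) - H(\mathcal{X}_1 \cap \mathcal{X}_3, \mathcal{X}_2 \cap \mathcal{X}_3|W) \\
&= I(\mathcal{X}_1 \cap \mathcal{X}_3; \mathcal{X}_2 \cap \mathcal{X}_3|W) \\
&\ge 0.
\end{align*}

The second step follows from the containment property $\mathcal{X}_3 \subseteq \mathcal{X}_1 \cup \mathcal{X}_2$, under which
$H(\mathcal{X}_1, \mathcal{X}_2, \mathcal{X}_3|W) = H(\mathcal{X}_1, \mathcal{X}_2|W)$. The first term in the fourth step follows by applying
lemma 7.6 with $W = W$, $X = \mathcal{X}_1$, $Y = \mathcal{X}_3$ and $Z = \mathcal{X}_1 \cap \mathcal{X}_3$, the second term by applying the
same lemma with $W = W$, $X = \mathcal{X}_2$, $Y = \mathcal{X}_3$ and $Z = \mathcal{X}_2 \cap \mathcal{X}_3$. The third terms in the third and
fourth steps are equal by the containment property. □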



Lemma 7.5. If $H(Z|X) = 0$ and $H(Z|Y) = 0$ then $I(X; Y|W) \ge H(Z|W)$.

Proof.

\begin{align*}
I(X; Y|W) &= I(X, Z; Y, Z|W) \\
&= H(X, Z|W) + H(Y, Z|W) - H(X, Y, Z|W) \\
&= H(Z|W) + H(X|W, Z) + H(Z|W) + H(Y|W, Z) - H(Z|W) - H(X, Y|W, Z) \\
&= H(Z|W) + I(X; Y|W, Z) \\
&\ge H(Z|W).
\end{align*}

□
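Lemma 7.5 can be illustrated numerically by building $X$ and $Y$ so that $Z$ is a component of both, which guarantees $H(Z|X) = H(Z|Y) = 0$. The sketch below draws random joint pmfs over four binary variables $(Z, U, V, W)$, sets $X = (Z, U)$ and $Y = (Z, V)$, and records the gap $I(X;Y|W) - H(Z|W)$. This particular construction, the binary alphabets, and the helper names are assumptions made for this illustration.

```python
import itertools
import math
import random


def entropy(pmf, indices):
    """Joint entropy in bits of the variables at the given positions."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in sorted(indices))
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marginal.values() if p > 0)


def lemma_gap(pmf):
    """Return I(X;Y|W) - H(Z|W) for X = (Z, U) and Y = (Z, V)."""
    Z, U, V, W = 0, 1, 2, 3
    X, Y = {Z, U}, {Z, V}
    mutual = (entropy(pmf, X | {W}) + entropy(pmf, Y | {W})
              - entropy(pmf, X | Y | {W}) - entropy(pmf, {W}))
    conditional = entropy(pmf, {Z, W}) - entropy(pmf, {W})
    return mutual - conditional


rng = random.Random(1)
outcomes = list(itertools.product((0, 1), repeat=4))
gaps = []
for _ in range(200):
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    pmf = {o: w / total for o, w in zip(outcomes, weights)}
    gaps.append(lemma_gap(pmf))
print(min(gaps))
```

The gap equals $I(X;Y|W,Z)$, as in the last equality of the proof, so it is nonnegative up to floating-point error.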

Lemma 7.6. If $H(Z|X) = 0$ and $H(Z|Y) = 0$ then $I(X; Y|W) \ge H(Z|W)$.